NR-grep: a fast and flexible pattern-matching tool
Author
Abstract
We present nrgrep ("nondeterministic reverse grep"), a new pattern matching tool designed for efficient search of complex patterns. Unlike previous tools of the grep family, such as agrep and GNU grep, nrgrep is based on a single and uniform concept: the bit-parallel simulation of a nondeterministic suffix automaton. As a result, nrgrep can find anything from simple patterns to regular expressions, exactly or allowing errors in the matches, with an efficiency that degrades smoothly as the complexity of the searched pattern increases. Another concept that is fully integrated into nrgrep and contributes to this smoothness is the selection of adequate subpatterns for fast scanning, which is also absent in many current tools. We show that the efficiency of nrgrep is similar to that of the fastest existing string matching tools for the simplest patterns, and by far unrivaled for more complex patterns.
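The core idea named in the abstract, bit-parallel simulation of a nondeterministic suffix automaton, can be illustrated for the simplest case of exact matching of a single string. The sketch below follows the BNDM (Backward Nondeterministic DAWG Matching) scheme and is not nrgrep's own code; the function name `bndm_search` and its interface are assumptions made for illustration. It assumes the pattern would fit in one machine word in a real implementation; Python integers are unbounded, so the state is masked to `m` bits explicitly.

```python
def bndm_search(pattern: str, text: str) -> list[int]:
    """Return the 0-based start positions of all occurrences of pattern in text.

    Illustrative BNDM-style sketch (hypothetical helper, exact matching only).
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []

    mask = (1 << m) - 1      # keeps the automaton state to m bits
    high = 1 << (m - 1)      # set when the suffix read so far is a pattern prefix

    # B[c] has bit (m - 1 - j) set when pattern[j] == c.
    B: dict[str, int] = {}
    for j, c in enumerate(pattern):
        B[c] = B.get(c, 0) | (1 << (m - 1 - j))

    hits: list[int] = []
    pos = 0
    while pos <= n - m:
        j = m                # number of window characters still unread
        shift = m            # safe shift if no pattern prefix is seen
        D = mask             # every automaton state starts active
        # Read the window right to left while some pattern factor still matches.
        while D != 0:
            D &= B.get(text[pos + j - 1], 0)
            j -= 1
            if D & high:
                if j > 0:
                    shift = j          # a proper prefix of the pattern starts at pos + j
                else:
                    hits.append(pos)   # the whole window matched
            D = (D << 1) & mask
        pos += shift
    return hits


if __name__ == "__main__":
    print(bndm_search("abra", "abracadabra"))
```

The window is skipped as soon as the state `D` becomes zero, meaning the characters read so far are not a factor of the pattern. This is where the sublinear average-case behaviour of this family of algorithms comes from. Extending the same state machine to classes of characters, regular expressions, and approximate matching is the subject of the paper itself and is not attempted here.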
Similar resources
PatMatch: a program for finding patterns in peptide and nucleotide sequences
Here, we present PatMatch, an efficient, web-based pattern-matching program that enables searches for short nucleotide or peptide sequences such as cis-elements in nucleotide sequences or small domains and motifs in protein sequences. The program can be used to find matches to a user-specified sequence pattern that can be described using ambiguous sequence codes and a powerful and flexible patt...
Enhancing GNU grep
The UNIX grep utility searches the input files selecting lines matching one or more patterns. Searching for patterns in text is an important operation in a number of domains, including program comprehension and software maintenance, structured text databases, indexing file systems, and searching natural language texts. Such a wide range of uses inspired the development of variations of the orig...
Survey of Global Regular Expression Print (GREP) Tools
The UNIX grep utility marked the birth of global regular expression print (GREP) tools. Searching for patterns in text is an important operation in a number of domains, including program comprehension and software maintenance, structured text databases, indexing file systems, and searching natural language texts. Such a wide range of uses inspired the development of variations of the original UN...
A Fast Multiple String-pattern Matching Algorithm
In this paper, we propose a simple but efficient multiple string pattern matching algorithm based on a compact encoding scheme. This algorithm scans text from left to right while encoding characters in the text based on the alphabet that occurs in the input patterns. The simple scanning algorithm demonstrates the ability to handle a very large number of input patterns simultaneously. Our experime...
Semantic Grep: Regular Expressions + Relational Abstraction
Searching source code is one of the most common activities of software engineers. Text editors and other support tools normally provide searching based on lexical expressions (regular expressions). Some more advanced editors provide a way to add semantic direction to some of the searches. Recent research has focused on advancing the semantic options available to text-based queries. Most of thes...
Journal: Softw., Pract. Exper.
Volume: 31, Issue:
Pages: -
Publication date: 2001